Nowadays, neural networks are widely used in many applications as artificial intelligence models for learning tasks. Since typically neural networks process a very large amount of data, it is convenient to formulate them within the mean-field and kinetic theory. In this work we focus on a particular class of neural networks, i.e. the residual neural networks, assuming that each layer is characterized by the same number of neurons N, which is fixed by the dimension of the data. This assumption allows to interpret the residual neural network as a time-discretized ordinary differential equation, in analogy with neural differential equations. The mean-field description is then obtained in the limit of infinitely many input data. This leads to a Vlasov-type partial differential equation which describes the evolution of the distribution of the input data. We analyze steady states and sensitivity with respect to the parameters of the network, namely the weights and the bias. In the simple setting of a linear activation function and one-dimensional input data, the study of the moments provides insights on the choice of the parameters of the network. Furthermore, a modification of the microscopic dynamics, inspired by stochastic residual neural networks, leads to a Fokker-Planck formulation of the network, in which the concept of network training is replaced by the task of fitting distributions. The performed analysis is validated by artificial numerical simulations. In particular, results on classification and regression problems are presented.

MEAN-FIELD AND KINETIC DESCRIPTIONS OF NEURAL DIFFERENTIAL EQUATIONS / Herty, M.; Trimborn, T.; Visconti, G.. - In: FOUNDATIONS OF DATA SCIENCE. - ISSN 2639-8001. - 4:2(2022), pp. 271-298. [10.3934/fods.2022007]

MEAN-FIELD AND KINETIC DESCRIPTIONS OF NEURAL DIFFERENTIAL EQUATIONS

Herty M.;Visconti G.
2022

Abstract

Nowadays, neural networks are widely used in many applications as artificial intelligence models for learning tasks. Since typically neural networks process a very large amount of data, it is convenient to formulate them within the mean-field and kinetic theory. In this work we focus on a particular class of neural networks, i.e. the residual neural networks, assuming that each layer is characterized by the same number of neurons N, which is fixed by the dimension of the data. This assumption allows to interpret the residual neural network as a time-discretized ordinary differential equation, in analogy with neural differential equations. The mean-field description is then obtained in the limit of infinitely many input data. This leads to a Vlasov-type partial differential equation which describes the evolution of the distribution of the input data. We analyze steady states and sensitivity with respect to the parameters of the network, namely the weights and the bias. In the simple setting of a linear activation function and one-dimensional input data, the study of the moments provides insights on the choice of the parameters of the network. Furthermore, a modification of the microscopic dynamics, inspired by stochastic residual neural networks, leads to a Fokker-Planck formulation of the network, in which the concept of network training is replaced by the task of fitting distributions. The performed analysis is validated by artificial numerical simulations. In particular, results on classification and regression problems are presented.
2022
continuous limit; kinetic equation; machine learning; mean-field equation; residual neural network
01 Pubblicazione su rivista::01a Articolo in rivista
MEAN-FIELD AND KINETIC DESCRIPTIONS OF NEURAL DIFFERENTIAL EQUATIONS / Herty, M.; Trimborn, T.; Visconti, G.. - In: FOUNDATIONS OF DATA SCIENCE. - ISSN 2639-8001. - 4:2(2022), pp. 271-298. [10.3934/fods.2022007]
File allegati a questo prodotto
File Dimensione Formato  
Herty_Mean-field_2022.pdf

solo gestori archivio

Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 1.8 MB
Formato Adobe PDF
1.8 MB Adobe PDF   Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1659731
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 3
  • ???jsp.display-item.citation.isi??? 4
social impact